Active exploration in parameterized reinforcement learning

Authors

  • Mehdi Khamassi
  • Costas Tzafestas
Abstract

Online model-free reinforcement learning (RL) methods with continuous actions play a prominent role in real-world applications such as robotics. However, when confronted with non-stationary environments, these methods crucially rely on an exploration-exploitation trade-off that is rarely adjusted dynamically and automatically to changes in the environment. Here we propose an active exploration algorithm for RL in structured (parameterized) continuous action spaces. This framework deals with a set of discrete actions, each of which is parameterized with continuous variables. Discrete exploration is controlled through a Boltzmann softmax function with an inverse temperature β parameter. In parallel, Gaussian exploration is applied to the continuous action parameters. We apply a meta-learning algorithm, based on the comparison between variations of short-term and long-term reward running averages, to simultaneously tune β and the width of the Gaussian distribution from which continuous action parameters are drawn. When applied to a simple virtual human-robot interaction task, we show that this algorithm outperforms continuous parameterized RL both without active exploration and with active exploration based on uncertainty variations measured by a Kalman-Q-learning algorithm.
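The action-selection and meta-learning scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the learning rates, the multiplicative update of β and σ, and all bounds are hypothetical choices; the paper specifies only that β and the Gaussian width are tuned by comparing short-term and long-term reward running averages.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(q, beta):
    """Boltzmann softmax over discrete action values, inverse temperature beta."""
    z = beta * (q - q.max())          # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

def select_action(q_values, theta_means, beta, sigma):
    """Pick a discrete action via softmax, then draw its continuous
    parameter from a Gaussian centred on the current estimate (width sigma)."""
    probs = softmax(q_values, beta)
    a = rng.choice(len(q_values), p=probs)
    theta = rng.normal(theta_means[a], sigma)
    return a, theta

def meta_update(r, r_short, r_long, beta, sigma,
                tau_short=0.1, tau_long=0.01, eta=0.5,
                beta_min=0.1, beta_max=20.0, sigma_min=0.05, sigma_max=1.0):
    """Hypothetical meta-learning step: update short- and long-term reward
    running averages; when recent reward exceeds the long-term baseline,
    exploit more (raise beta, shrink sigma), and explore more otherwise."""
    r_short += tau_short * (r - r_short)
    r_long  += tau_long  * (r - r_long)
    delta = r_short - r_long
    beta  = np.clip(beta * np.exp(eta * delta), beta_min, beta_max)
    sigma = np.clip(sigma * np.exp(-eta * delta), sigma_min, sigma_max)
    return r_short, r_long, beta, sigma
```

With this rule, a drop in recent reward relative to the long-term average (e.g. after an environmental change) automatically lowers β and widens σ, re-opening both discrete and continuous exploration.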


Similar articles

Efficient Exploration in Reinforcement Learning

Exploration plays a fundamental role in any active learning system. This study evaluates the role of exploration in active learning and describes several local techniques for exploration in finite, discrete domains, embedded in a reinforcement learning framework (delayed reinforcement). This paper distinguishes between two families of exploration schemes: undirected and directed exploration. Whil...


Reinforcement Learning with Polynomial Learning Rate in Parameterized Models

We consider reinforcement learning in a parameterized setup, where the model is known to belong to a finite set of Markov Decision Processes (MDPs) under the discounted return criterion. We propose an on-line algorithm for learning in such parameterized models, the Parameter Elimination (PEL) algorithm, and analyze its performance in terms of the total mistakes. The algorithm relies on Wald’s s...



Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning

Intrinsically motivated spontaneous exploration is a key enabler of autonomous lifelong learning in human children. It allows them to discover and acquire large repertoires of skills through self-generation, self-selection, self-ordering and self-experimentation of learning goals. We present the unsupervised multi-goal reinforcement learning formal framework as well as an algorithmic approach c...


Efficient Reinforcement Learning in Parameterized Models: Discrete Parameter Case

We consider reinforcement learning in the parameterized setup, where the model is known to belong to a parameterized family of Markov Decision Processes (MDPs). We further impose the assumption that the set of possible parameters is finite, and consider the discounted return. We propose an on-line algorithm for learning in such parameterized models, dubbed the Parameter Elimination (PEL) algor...



Journal:
  • CoRR

Volume: abs/1610.01986
Pages: -
Publication date: 2016